A New SNR-Feature Mapping for Robust Multistream Speech Recognition

Authors

  • Frédéric BERTHOMMIER
  • Hervé GLOTIN
Abstract

We describe a new model of CASA labelling which assigns to each time-frequency region a probability of being "clean" enough to feed a multistream recogniser adapted only to clean data. This labelling process is based on the harmonicity of speech. The probability is evaluated according to an SNR-feature mapping and the choice of an SNR decision threshold. This extends a previous method [1] based on the binary detection of noisy time-frequency regions, followed by partial recognition of the clean regions. The labelling process is adapted to a new multistream recognition approach [5], since these probabilities serve to weight the streams' posteriors.
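
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian form of the SNR estimate, the weighted-sum combination rule, and the names `prob_clean` and `combine_posteriors` are all assumptions made for illustration. The sketch turns an SNR estimate for a subband into a probability that the SNR exceeds a decision threshold, then uses those probabilities to weight per-stream class posteriors.

```python
import math


def prob_clean(mean_snr_db, std_snr_db, threshold_db):
    """Probability that the SNR exceeds the decision threshold.

    Assumption: the SNR-feature mapping yields a Gaussian SNR estimate
    (mean and standard deviation in dB) for a given feature value.
    """
    z = (threshold_db - mean_snr_db) / std_snr_db
    # 1 - Phi(z), written with the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def combine_posteriors(stream_posteriors, weights):
    """Weighted sum of per-stream class posteriors, renormalised.

    Assumption: a linear (weighted-sum) rule; the abstract only says
    that the probabilities weight the streams' posteriors.
    """
    n_classes = len(stream_posteriors[0])
    total = sum(weights)
    if total == 0:
        # No stream is trusted: fall back to a uniform distribution
        return [1.0 / n_classes] * n_classes
    return [
        sum(w * p[c] for w, p in zip(weights, stream_posteriors)) / total
        for c in range(n_classes)
    ]


# Hypothetical example: two subband streams, two classes
weights = [prob_clean(15.0, 5.0, 0.0), prob_clean(-10.0, 5.0, 0.0)]
combined = combine_posteriors([[0.9, 0.1], [0.2, 0.8]], weights)
```

In this example the first stream has a high estimated SNR, so its posteriors dominate the combined result, while the noisy second stream contributes little. Raising `threshold_db` makes the labelling stricter and lowers every stream's weight.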

Similar resources

A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network

Performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...


A Measure of Speech and Pitch Reliability from Voicing

We propose a CASA labelling method of the TF representation, which is based on the periodicity of the speech, related to the voicing. A local voicing index is estimated in four subbands after demodulation of the signal. This index is used as a reliability measure for both pitch identification and speech recognition. First, this model allows robust f0 identification thanks to the voicing index, ...


A new posterior based audio-visual integration method for robust speech recognition

We describe the development of a multistream HMM based audio-visual speech recognition (AVSR) system and a new method for integrating the audio and visual streams using frame level posterior probabilities. This is compared to the standard feature concatenation and weighted product methods in speaker-dependent tests using our own multimodal database, by examining speech recognition robustness to...


Audio-Visual Speech Modeling for Continuous Speech Recognition

This paper describes a speech recognition system that uses both acoustic and visual speech information to improve the recognition performance in noisy environments. The system consists of three components: 1) a visual module; 2) an acoustic module; and 3) a sensor fusion module. The visual module locates and tracks the lip movements of a given speaker and extracts relevant speech features. This...


Multistream Bandpass Modulation Features for Robust Speech Recognition

Current understanding of speech processing in the brain suggests dual streams of processing of temporal and spectral information, whereby slow vs. fast modulations are analyzed along parallel paths that encode various scales of information in speech signals. This distinctive way in which biology analyzes the multiplicity of information in speech signals along parallel paths can bear great lessons f...


Publication date: 1999